A Multi-Threaded Implementation of Data-Parallelism
Authors
Dean Engelhardt and Andrew Wendelborn
Department of Computer Science, University of Adelaide, Adelaide, SA 5005, Australia. E-mail: fdean,[email protected]
Abstract
In previous work, we proposed a multithreaded execution model for describing nested data-parallelism on distributed multiprocessors in a fashion that is generic with respect to the partitioning of data aggregates within the system. This paper demonstrates an approach which uses this abstract model as the basis for a runtime system for a data-parallel functional language. We describe an active-message-based implementation of the model on the Thinking Machines CM-5 and also consider various issues related to efficient compilation targeting such a platform. Several issues central to the performance of the model are addressed. A hybrid scheme for thread synchronization and dataflow is presented which incorporates both a highly efficient mode of matching and a mode permitting the most general forms of interaction. We also describe a generic thread library of data-parallel primitive operations which operate independently of data partitioning. Finally, we detail work undertaken to instrument our execution environment for visualization and tracing.
Similar resources
An open source C++ implementation of multi-threaded Gaussian mixture models, k-means and expectation maximisation
Modelling of multivariate densities is a core component in many signal processing, pattern recognition and machine learning applications. The modelling is often done via Gaussian mixture models (GMMs), which use computationally expensive and potentially unstable training algorithms. We provide an overview of a fast and robust implementation of GMMs in the C++ language, employing multi-threaded ...
A Multi-threaded Approach to Simulated Soccer Agents for the RoboCup Competition
To meet the timing requirements set by the RoboCup soccer server simulator, this paper proposes a multi-threaded approach to simulated soccer agents for the RoboCup competition. At its higher level each agent works at three distinct phases: sensing, thinking and acting. Instead of the traditional single threaded approaches, POSIX threads have been used here to break down these phases and implem...
The Effect of Cache on the Performance of a Multi-Threaded Pipelined RISC Processor
This paper examines the effects of multithreaded pipelining on the CPI (cycles per instruction) of a RISC processor. The desired CPI in a conventional (single-threaded) RISC processor is one instruction per cycle. However, the CPI is typically more than one because of data hazards, control hazards, and resource hazards in the pipeline. A multi-threaded processor performs a context switch betwee...
ActiveMonitor: Non-blocking Monitor Executions for Increased Parallelism
We present a set of novel ideas on design and implementation of monitor objects for multi-threaded programs. Our approach has two main goals: (a) increase parallelism in monitor objects and thus provide performance gains (shorter runtimes) for multi-threaded programs, and (b) introduce constructs that allow programmers to easily write monitor-based multi-threaded programs that can achieve these...
A Scalable Concurrent malloc(3) Implementation for FreeBSD
The FreeBSD project has been engaged in ongoing work to provide scalable support for multi-processor computer systems since version 5. Sufficient progress has been made that the C library’s malloc(3) memory allocator is now a potential bottleneck for multi-threaded applications running on multiprocessor systems. In this paper, I present a new memory allocator that builds on the state of the art...
Using RenderScript and RCUDA for Compute Intensive Tasks on Mobile Devices: a Case Study
The processing power of mobile devices is continuously increasing. In this paper we perform a case study in which we assess three different programming models that can be used to leverage this processing power for compute intensive tasks. We use an imaging algorithm and compare a reference implementation of this algorithm based on OpenCV with a multi threaded RenderScript implementation and an ...